# Replication Package

This is the replication package for "The Optimal Taxation of Couples," by Mikhail Golosov and Ilia Krasikov.
Forthcoming in the Quarterly Journal of Economics.
This version: 03-20-2025.


## Overview

The code in this replication package analyzes data constructed from IPUMS-CPS (Flood et al., 2022) using R and Julia. Three main files run all of the code to generate the three figures and calibrated parameters in the paper. The replicator should expect the code to run for under 30 minutes.


## Folder Structure

The main folder contains the following subfolders.

    - /scripts			Contains the code files used to generate the figures and values in the paper. See the "Replication Files" section for more details.
    - /data				Contains the raw and intermediate datasets used to produce the analyses of the paper. This folder is initially empty. Before running the code, the user should place the IPUMS-CPS extract files (.dat and .xml) in this folder.
    - /output			Contains the generated figures. The folder is initially empty.


## Data Source

Below we provide instructions to download the data from IPUMS-CPS, University of Minnesota (https://cps.ipums.org/).
We use the 2021 Annual Social and Economic Supplement (ASEC) of the Current Population Survey (CPS).

a) Select sample: 2021 wave of CPS-ASEC.

b) Select variables:
    - RELATE:			Relationship to household head
    - AGE:				Age
    - SEX:				Sex
    - MARST:			Marital status
    - SPLOC:			Person number of spouse (from programming)
    - SPRULE: 			Rule for linking spouse
    - FAMSIZE: 			Number of own family members in household
    - NCHILD: 			Number of own children in household
    - FAMUNIT: 			Family unit membership
    - ASPOUSE: 		Spouse line number
    - PECOHAB: 		Cohabiting partner line number
    - FTYPE: 			Family Type
    - FAMKIND: 			Kind of family
    - FAMREL: 			Relationship to family
    - EMPSTAT: 			Employment status
    - CLASSWKR: 		Class of worker
    - UHRSWORKT: 		Hours usually worked per week at all jobs
    - UHRSWORK1: 		Hours usually worked per week at main job
    - CLASSWLY: 		Class of worker last year
    - WORKLY: 			Worked last year
    - WKSWORK1: 		Weeks worked last year
    - WKSWORK2: 		Weeks worked last year, intervalled
    - UHRSWORKLY: 		Usual hours worked per week (last yr)
    - WKSUNEM1: 		Weeks unemployed last year
    - WKSUNEM2: 		Weeks unemployed last year, intervalled
    - FULLPART: 		Worked full or part time last year
    - INCWAGE: 		Wage and salary income

c) Create and download the data extract as a .dat file. Also download the DDI file (which has the .xml extension).


## Computational Requirements

The analysis was conducted under Julia Version 1.8.5 and R Version 4.3.1.

We use the following packages and package versions in our analyses using Julia:
    - CSV				v0.8.5
    - DataFrames		v0.22.7
    - Dierckx			v0.5.3
    - Distributions		v0.25.107
    - GaussianDistributions v0.5.2
    - Interpolations		v0.15.1
    - Ipopt				v1.6.0
    - JuMP				v1.18.1
    - KernelDensity		v0.6.8
    - LaTeXStrings		v1.3.1
    - Plots				v1.39.0
    - Roots				v2.1.0
    - StatsBase			v0.33.21

We use the following packages and package versions in our analyses using R:
    - wCorr				v1.9.8
    - tidyr				v1.3.0
    - dplyr				v1.1.4
    - janitor				v2.2.1
    - readr				v2.1.4
    - ggplot2			v3.5.1
    - DescTools			v0.99.59
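
To check whether the installed R packages match the versions listed above, something like the following could be run before the replication scripts. This is only a sketch: `check_versions` is a hypothetical helper that is not part of the replication package, and it reports mismatches without installing anything.

```r
# Versions listed above for the R packages.
required <- c(
  wCorr = "1.9.8", tidyr = "1.3.0", dplyr = "1.1.4", janitor = "2.2.1",
  readr = "2.1.4", ggplot2 = "3.5.1", DescTools = "0.99.59"
)

# Hypothetical helper: compare installed package versions with the required ones.
check_versions <- function(required) {
  installed <- vapply(names(required), function(pkg) {
    if (requireNamespace(pkg, quietly = TRUE)) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_  # package is not installed
    }
  }, character(1))
  data.frame(
    package   = names(required),
    required  = unname(required),
    installed = unname(installed),
    match     = !is.na(installed) & unname(installed) == unname(required)
  )
}

print(check_versions(required))
```

Rows with `match` equal to `FALSE` point to packages that are missing or installed at a different version.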


## Replication Files

The user must first place the resulting .dat and .xml files in the /data folder.

The user must then run "01_clean_data.R", after setting the variable "data_name" to the name of the .xml file downloaded from IPUMS-CPS following the instructions in the "Data Source" section.
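
Before running the cleaning script, it may help to confirm that both extract files are where the script expects them. The snippet below is a minimal sketch, assuming the working directory is the main folder of the package; `find_extract_files` is a hypothetical helper and not part of the replication code.

```r
# Hypothetical helper: list the IPUMS extract files found in a folder.
find_extract_files <- function(data_dir = "data") {
  list(
    dat = list.files(data_dir, pattern = "\\.dat$"),
    xml = list.files(data_dir, pattern = "\\.xml$")
  )
}

found <- find_extract_files("data")
if (length(found$dat) == 0 || length(found$xml) == 0) {
  message("Expected a .dat and a .xml file in the data folder.")
} else {
  message("Found extract files: ", found$xml[1], " and ", found$dat[1])
}
```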

The user can then run "02_taxes_data.ipynb" and "03_taxes_computations.ipynb" to generate the figures and the calibrated values of Table 1 in the paper. Each notebook produces the outputs listed below.

    - "01_clean_data.R": The program runs in R.
		Processes the IPUMS-CPS data for analysis.
		In particular, we restrict the sample to
			(1) households with spouses
			(2) households in which both spouses are in the data and live in the same household
			(3) persons between the ages of 25 and 65
			(4) persons who worked at least 20 weeks last year
			(5) persons who are not self-employed or unpaid family workers
    - "02_taxes_data.ipynb": The program runs in Julia.
		Generates the six panels in Figure 4. Also generates the calibrated values of Table 1 in the second cell of the code file.
    - "03_taxes_computations.ipynb": The program runs in Julia.
		Generates the four panels of Figure 5 as well as Figure 6.
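
The sample restrictions (1) to (5) listed for "01_clean_data.R" can be illustrated on a toy data set. The sketch below is not the code of "01_clean_data.R" and rests on several assumptions:

- The household and person identifiers SERIAL and PERNUM are present in the extract.
- The class-of-worker restriction uses CLASSWLY, with codes 10, 13 and 14 for the self-employed and 29 for unpaid family workers.
- The age bounds are inclusive.
- A couple is kept only if both spouses pass every filter.
- Each household contains at most one couple.

All values are made up.

```r
library(dplyr)

# Toy person-level records; every value below is invented for illustration.
persons <- data.frame(
  SERIAL   = c(1, 1, 2, 2, 3, 4, 4),       # household identifier (assumed present)
  PERNUM   = c(1, 2, 1, 2, 1, 1, 2),       # person number within household
  SPLOC    = c(2, 1, 2, 1, 0, 2, 1),       # 0 = no spouse present in the household
  AGE      = c(40, 38, 30, 67, 45, 50, 52),
  WKSWORK1 = c(52, 48, 52, 40, 52, 10, 52),
  CLASSWLY = c(22, 22, 22, 22, 22, 22, 13) # 22 stands in for a wage/salary worker
)

# Assumed codes for self-employed (10, 13, 14) and unpaid family workers (29);
# verify them against the codebook of your own extract.
excluded_classes <- c(10, 13, 14, 29)

eligible <- persons %>%
  filter(
    SPLOC > 0,                         # (1)-(2) spouse present in the same household
    AGE >= 25, AGE <= 65,              # (3) age range, bounds assumed inclusive
    WKSWORK1 >= 20,                    # (4) worked at least 20 weeks last year
    !(CLASSWLY %in% excluded_classes)  # (5) class-of-worker exclusion
  )

# Keep a household only if both spouses pass every filter (an assumption).
couples <- eligible %>%
  group_by(SERIAL) %>%
  filter(n() == 2) %>%
  ungroup()

print(couples)
```

In this toy example only household 1 survives: household 2 fails the age restriction, household 3 has no spouse present, and household 4 fails the weeks-worked and class-of-worker restrictions.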


## Output

All output is contained in the /output folder.

    - Figure 4: Each panel is given by the following files:
		(A) fig4a_G.pdf
		(B) fig4b_lefttail.pdf
		(C) fig4c_a1.pdf
		(D) fig4d_copula.pdf
		(E) fig4e_righttail.pdf
		(F) fig4f_a2.pdf
    - Figure 5: Each panel is given by the following files:
		(A) fig5a_opttaxperc.pdf
		(B) fig5b_opttaxrel.pdf
		(C) fig5c_statusperc.pdf
		(D) fig5d_statusrel.pdf
    - Figure 6: fig6_opttaxshare.pdf
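
To confirm that a run produced every figure file listed above, a check along the following lines could be used. It is a sketch that assumes the working directory is the main folder of the package; `missing_outputs` is a hypothetical helper and not part of the replication code.

```r
# File names as listed in the "Output" section.
expected_outputs <- c(
  "fig4a_G.pdf", "fig4b_lefttail.pdf", "fig4c_a1.pdf", "fig4d_copula.pdf",
  "fig4e_righttail.pdf", "fig4f_a2.pdf",
  "fig5a_opttaxperc.pdf", "fig5b_opttaxrel.pdf", "fig5c_statusperc.pdf",
  "fig5d_statusrel.pdf", "fig6_opttaxshare.pdf"
)

# Hypothetical helper: return the expected files that are not in output_dir.
missing_outputs <- function(output_dir = "output", files = expected_outputs) {
  files[!file.exists(file.path(output_dir, files))]
}

missing <- missing_outputs("output")
if (length(missing) == 0) {
  message("All expected figure files are present.")
} else {
  message("Missing: ", paste(missing, collapse = ", "))
}
```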
    
The calibrated values of Table 1 are generated in the second cell of the "02_taxes_data.ipynb" code file.


## References

Sarah Flood, Miriam King, Renae Rodgers, Steven Ruggles, J. Robert Warren and Michael Westberry. Integrated Public Use Microdata Series, Current Population Survey: Version 10.0 [dataset]. Minneapolis, MN: IPUMS, 2022. https://doi.org/10.18128/D030.V10.0

